Quantitative Biology
○ Wiley
Preprints posted in the last 90 days, ranked by how well they match Quantitative Biology's content profile, based on 11 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit.
Cui, T.; Wang, Z.; Wang, T.
AI-based molecular dynamics simulation brings ab initio calculations to biomolecules efficiently; at its core is the machine learning force field (MLFF), which accurately predicts molecular energies and forces. Most existing MLFFs assume localized interatomic interactions, limiting their ability to accurately model non-local interactions, which are crucial in biomolecular dynamics. In this study, we introduce ViSNet-PIMA, which efficiently learns non-local interactions through a physics-informed multipole aggregator (PIMA) and accurately encodes molecular geometric information. ViSNet-PIMA outperforms all state-of-the-art MLFFs for energy and force predictions across different kinds of biomolecules and various conformations on the MD22 and AIMD-Chig datasets, while adapting the PIMA blocks into other MLFFs further achieves 55.1% performance gains, demonstrating the superiority of ViSNet-PIMA and the universality of the model design. Furthermore, we propose AI2BMD-PIMA, which incorporates ViSNet-PIMA into the AI2BMD simulation program by introducing a "Transfer Learning-Pretraining-Finetuning" scheme and replacing molecular mechanics-based non-local calculations among protein fragments with ViSNet-PIMA; this reduces AI2BMD's energy and force calculation errors by more than 50% for different protein conformations and for protein folding and unfolding processes. ViSNet-PIMA advances ab initio calculation for entire biomolecules, amplifying the application value of AI-based molecular dynamics simulations and property calculations in biochemical research.
Wen, K.; Zha, J.; Chen, S.; Zhong, J.; Yuan, L.; Cui, Y.; Shi, X.; Qin, W.; Lan, X.; Liu, Y.; Yang, X.; Qin, H.; Li, M.; Guo, P.; Xiao, Q.; Wu, T.; Zhou, Y.; Cao, C.; Ning, S.; Wu, C.; Gao, Q.; He, H.; Ma, Y.; An, Z.; Liu, X.; Chen, Y.; Zheng, Z.; Wei, H.; Ma, Y.; Zhang, J.
Coherent Ising machines (CIMs) excel at solving large-scale combinatorial optimization problems (COPs), but their insufficient long-term stability has hindered their application in compute-intensive tasks such as computer-aided drug discovery (CADD). By improving fiber vibration isolation and the temperature control system, we have implemented a 2000-node CIM named QBoson-CPQC-3Gen that achieves stable solutions over one hour on large-scale COPs. Graph-based encoding schemes were further introduced to realize a CIM-based CADD workflow including allosteric site detection, protein-peptide docking and intermolecular similarity calculation. CIM-based methods demonstrated greater speed and accuracy than heuristic algorithms. In particular, QBoson-CPQC-3Gen identified 2 novel druggable sites and bioactive compounds for 6 targets, which were further validated in vitro, in-cell and by crystal structures. Our contributions establish a quantum-computing framework for multi-stage drug discovery, representing a significant advancement in both quantum computing applications and pharmaceutical research.
Yan, J.; Wu, Q.; Li, Y.; Cai, J.; Zhou, M.; Campbell-Valois, F.-X.; Siu, S. W.
Cancer remains a major global health threat, with its incidence and mortality rates consistently rising in recent years. Anticancer peptides (ACPs) are short amino acid chains that can inhibit the growth or spread of cancer cells. Compared to traditional treatments, ACPs are a promising class of potential cancer therapies due to their multiple mechanisms, potential for combination cancer therapy, enhanced immune function, lower toxicity to normal tissues, fewer side effects, and less drug resistance. Although it is necessary to explore novel ACPs, traditional wet-lab methods for selecting them are labor-intensive, time-consuming, and expensive. To accelerate the discovery of novel ACPs, we proposed Diffusion-ACP39, a latent diffusion-based generative model with synchronized seed autoencoder for anticancer peptide design, capable of generating novel peptides with lengths ranging from 5 to 39 amino acids. Furthermore, we developed RF-ACP39, a random forest classifier model to assess the generative power of Diffusion-ACP39. Finally, Diffusion-ACP39 achieved an accuracy of 94.5% when generating 10,000 peptides with RF-ACP39. We also qualitatively analyzed the differences among true ACPs, random sequences, random peptides, and generated ACPs, demonstrating that the generated ACPs are most similar to true ACPs.
Li, S.; Wang, Y.; Shu, Z.; Grima, R.; Jiang, Q.; Cao, Z.
Biochemical reactions are inherently stochastic, with their kinetics commonly described by chemical master equations (CMEs). However, the discrete nature of molecular states renders likelihood-based parameter inference from CMEs computationally intensive. Here, we introduce an inference method that leverages analytical solutions in the probability generating function (PGF) space and systematically evaluate its efficiency, accuracy, and robustness. Across both steady-state and time-resolved count data, our numerical experiments demonstrate that the PGF-based method consistently outperforms existing approaches in terms of both computational efficiency and inference accuracy, even under data contamination. These favorable properties further enable the extension of the PGF-based framework to model selection, a task typically considered computationally prohibitive. Using time-resolved data, we show that the method can correctly identify complex gene expression models with more than three gene states, a task that cannot be reliably achieved using steady-state data alone.
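To make the idea of fitting in PGF space concrete, here is a minimal sketch, not the authors' method. It assumes the simplest possible model (constant production and first-order degradation, whose steady state is Poisson with PGF G(z) = exp(lam*(z-1)), where lam is the production/degradation ratio) and estimates lam by least squares between the empirical and analytical PGF on a small grid of z values. The function names, the z grid, and the golden-section optimizer are all illustrative assumptions.

```python
import math
from collections import Counter


def histogram_from_counts(counts):
    """Turn raw molecule counts into normalised frequencies {n: weight}."""
    tally = Counter(counts)
    total = sum(tally.values())
    return {n: c / total for n, c in tally.items()}


def empirical_pgf(hist, z):
    """Empirical PGF: the weighted average of z**n over the observed counts."""
    total = sum(hist.values())
    return sum(w * z ** n for n, w in hist.items()) / total


def poisson_pgf(lam, z):
    """Steady-state PGF of the assumed birth-death model: exp(lam * (z - 1))."""
    return math.exp(lam * (z - 1.0))


def golden_section_min(f, lo, hi, tol=1e-8):
    """Minimise a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def fit_rate_ratio(hist, z_grid=(0.3, 0.5, 0.7, 0.9), lo=0.01, hi=50.0):
    """Least-squares fit of lam in PGF space (hypothetical helper)."""
    # The empirical PGF is evaluated once per grid point, not once per trial lam.
    targets = [(z, empirical_pgf(hist, z)) for z in z_grid]

    def loss(lam):
        return sum((g - poisson_pgf(lam, z)) ** 2 for z, g in targets)

    return golden_section_min(loss, lo, hi)
```

With raw data, `fit_rate_ratio(histogram_from_counts(counts))` returns the estimate. The appeal of working in PGF space is that each loss evaluation costs a handful of exponentials instead of a full probability distribution over the discrete state space; richer models would swap in their own analytical PGF.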
Mukherjee, S.; Srivastava, D.; Patra, N.
Protein-DNA complexes are involved in vital cellular functions such as gene regulation, replication, transcription, packaging, rearrangement, and damage repair. In this work, a streamlined geometric formalism for computing the absolute binding free energy was used to obtain chemically accurate in silico estimates of the binding free energy of three Protein-DNA complexes. Additionally, molecular interactions between Protein and DNA involved hydrogen bonds, electrostatic, van der Waals, and hydrophobic interactions. Using this formalism, researchers can obtain the absolute binding free energy for a Protein-DNA complex with remarkable accuracy and modest computational cost.
Maroilley, T.; Barbosa, V. R. A.; Mascarenhas, R.; Ferris, S.; Diao, C.; AlAwadhi, F.; Aldakheel, S.; Ali, A.; Alkanderi, D.; Alshatti, M.; Alsuwaileh, S.; Asghar, K.; Bui, R.; Chai, B.; Dsouza, L.; Nezhad, P. E.; Garcia-Volk, E.; Haq, Z.; Hossain, S.; Johnson, G.; Kotikalapudi, N.; Lalani, I.; Lenz, C.; Louie, T.; Moore, S.; Patel, S.; Prasai, S.; Qureshi, R.; Rahmani, F.; Shakir, B.; Ahamed, S. S.; Tran, H. A.; Waziha, R.; Wood, C. M.; Zbinden, S.; Anderson, D.; Tarailo-Graovac, M.
Bioinformatics, a discipline at the crossroads of Biology and Computational Sciences, also referred to as Computational Biology, is now widespread in research programs. However, implementing any Bioinformatics project requires the ability to comprehend biological concepts and apply computational approaches, and undergraduate programs offering such multi-disciplinary training are rare. In addition, understanding the dynamic between Biology research projects and Bioinformatics analyses is challenging without real-life experience. Course-based undergraduate research experience (CURE) courses are innovative programs that allow more students to acquire research experience and provide the perfect setting to introduce students to applied bioinformatics. As a part of the Bachelor of Health Sciences of the Cumming School of Medicine at the University of Calgary (Canada), a CURE in applied bioinformatics was implemented in the Winter of 2023 to 2025. Students investigated the effect of structural variants (SVs, genetic variants larger than 50 bp) on gene expression in the model organism Caenorhabditis elegans (a hermaphrodite 1-mm long roundworm). The students detected and characterized SVs by analyzing genome and transcriptome sequencing data of C. elegans strains called balancers, which are known to carry large genomic variations that balance regions of the genome by limiting recombination and allowing maintenance of lethal mutations. They used Galaxy, a public web-based supercomputing resource, as well as a local High-Performance computing system and R, to report different effects of SVs on gene expression and splicing. Students' research explained the molecular mechanism behind the uncoordinated phenotype caused by the reciprocal translocation eT1(III;V) and uncovered unexpected effects on the expression of an understudied gene.
We evaluated the course's impact on student learning journeys and showed that the CURE favored students' understanding of the Bioinformatics field and fostered their research interest. We provide here guidelines to facilitate CURE implementation and improve undergraduate students' access to bioinformatics research experiences.
Gu, X.
Our recent work on molecular evolution and population genetics postulated that individuals with a specific mutation exhibit a fluctuation in fitness, FSI (fluctuating selection among individuals) for short, whereas the fitness effect of the wildtype remains constant. An intriguing phenomenon called selection-duality emerges: a slightly beneficial mutation could be under negative selection (the substitution rate is less than the mutation rate). It appears that selection-duality is bounded by two bounds: the generic neutrality, where the mutation is neutral in terms of average fitness, and the substitution neutrality, where the substitution rate equals the mutation rate. In addition, the midpoint between generic neutrality and substitution neutrality is called the FSI-neutrality. An important problem concerns the age profile of allele frequency, i.e., the time at which a mutation arose given its frequency in the current population (the allele-age problem for short). Solving this problem under selection-duality would help extend the standard coalescent theory, which is based on strict neutrality, to a more general form under selection-duality. In this paper, we studied the allele-age problem under selection-duality by the first arrival time approach and the mean age approach, respectively. Since the general solution of the allele-age problem under selection-duality is not available, we focused on solving the problem at the substitution neutrality (the upper bound of selection-duality), the FSI-neutrality (the midpoint) and the generic neutrality (the lower bound), respectively. Our analysis yields an overall picture in which the mean first-arrival age of a mutation at the substitution neutrality is theoretically identical to that at the FSI-neutrality, which is numerically close to that at the generic neutrality.
For illustration, we calculated the mean age of nonsynonymous mutations in the human population and demonstrated that the estimated allele-age could be overestimated considerably when the effect of FSI was neglected.
Wu, Y.
Intercellular communication is governed by the spatiotemporal dynamics of protein complexes at the cell-cell interface. However, conventional static interaction models fail to incorporate key physical constraints, such as steric hindrance, spatial compartmentalization, and dimensionality reduction that regulate complex assembly in vivo. To bridge the gap between static network topology and dynamic systems biology, we developed a multi-scale computational framework. We first identified a highly conserved, Fibroblast Growth Factor Receptor 1 (FGFR1)-centered cell adhesion and signaling motif by analyzing a diverse set of human cell-cell interfaces. We then constructed a multi-layer spatial stochastic simulator to recapitulate and interrogate the dynamic behavior of this network motif at cell-cell interfaces. Atomic-resolution structural models of the protein complexes within the motif were further generated using AlphaFold to define interaction rules for the stochastic simulations by categorizing binding interfaces. Our results show that the structural arrangement of cell-cell adhesion complexes controls how FGFR1 receptors cluster at the cell-cell interface, effectively dividing the membrane into distinct functional microdomains. Competition from decoy receptors further regulates this process by capturing receptors before they can participate in signaling. Even small changes in binding affinity can therefore alter receptor organization and disrupt normal signal transduction, which may contribute to human disease. By integrating macro-scale interactomics, atomic-level structural bioinformatics, and mesoscale stochastic modeling, this study reveals how structural interaction rules, combined with spatial constraints, shape the formation and function of intercellular signaling networks.
Yang, F.; Hanks, E. M.; Conway, J. M.; Bjornstad, O. N.; Thanh, N. T. L.; Boni, M. F.; Servadio, J. L.
Infectious disease surveillance systems in tropical countries show that respiratory disease incidence generally manifests as year-round activity with weak fluctuations and irregular seasonality. Previously, using a ten-year time series of influenza-like illness (ILI) collected from outpatient clinics in Ho Chi Minh City (HCMC), Vietnam, we found a combination of nonannual and annual signals driving these dynamics, but with unknown mechanisms. In this study, we use seven stochastic dynamical models incorporating humidity, temperature, and school term to investigate plausible mechanisms behind these annual and nonannual incidence trends. We use iterated filtering to fit the models and evaluate the models by comparing how well they replicate the combination of annual and nonannual signals. We find that a model including specific humidity, temperature, and school term best fits our observed data from HCMC and partially reproduces the irregular seasonality. The estimated effects from specific humidity and temperature on transmission are nonlinearly negative but weak. School dismissal is associated with decreased transmission, but also with low magnitude. Under these weak external drivers, we hypothesize that stochasticity makes a strong sub-annual cycle more likely to be observed in ILI disease dynamics. Our study shows a possible mechanism for respiratory disease dynamics in the tropics. When the external drivers are weak, the seasonality of respiratory disease dynamics is prone to the influence of stochasticity.
Wang, D.; Froehlich, F.; Stapor, P.; Schaelte, Y.; Huth, M.; Eils, R.; Kallenberger, S.; Hasenauer, J.
Experimental methods for characterizing single cells and cell populations have improved tremendously over the past decades. This progress has enabled the development of quantitative, mechanistic models for cellular processes based on either single-cell or bulk data. However, coherent statistical frameworks for the model-based integration of different data types at the single-cell and population levels are still missing. In this work, we present a mathematical modeling approach for integrating single-cell time-lapse, single-cell snapshot, single-cell time-to-event and population-average data. Utilizing a formulation based on nonlinear mixed-effect modeling, we enable the description of multiple data types, with and without single-cell resolution. Furthermore, we propose a tailored parameter estimation scheme that facilitates the assessment of underlying process parameters. Our study demonstrates that the proposed approach can reliably integrate diverse data types, thereby improving parameter identifiability and prediction accuracy. Applying this framework to extrinsic apoptosis reveals that simultaneously considering multiple data types can be essential, particularly when experimental constraints limit data availability. The proposed approach is broadly applicable and may significantly advance our understanding of complex biological processes.
Ringer McDonald, A.; Vazquez, A. V.
Developing scientific reading skills is critical for undergraduate STEM students due to scientific literature's unique formatting and use of specialized jargon. Generative AI tools such as ChatGPT offer students the ability to ask questions about what they are reading interactively. Previously, we reported the development of a ChatGPT-assisted reading guide that combined structured, active reading strategies with using ChatGPT to clarify unfamiliar words and concepts in real time. In the initial study, undergraduates found the ChatGPT-assisted reading guide helpful in understanding the abstract and introduction of a journal article. Here, the ChatGPT-assisted reading guide was used in a journal club assignment for an undergraduate chemistry course. ChatGPT transcripts were analyzed for common types of interactions, and students were surveyed about their experience. Overall, students reported that using the ChatGPT-assisted reading guide was helpful in understanding the article and helped them have more productive class discussions. However, some students also expressed skepticism about using AI tools, citing concerns about the accuracy of AI-generated information and the effect of using AI on their own learning.
Su, H.; Liang, Y.; Xiao, W.; Li, H.; Liu, X.; Yang, Z.; Yuan, M.; Liu, X.
The escalating crisis of antimicrobial resistance necessitates novel therapeutic strategies, among which drug combination therapy shows great promise by enhancing efficacy and reducing toxicity. However, identifying effective synergistic pairs from the vast combinatorial space remains experimentally challenging and resource-intensive. To address this, we introduce GCN-Mamba, a deep learning framework that integrates Graph Convolutional Networks (GCN) with the Mamba State Space Model. This architecture captures both local molecular topological structures and global implicit interactions by leveraging Extended 3-Dimensional Fingerprints (E3FP) and bacterial gene expression profiles. Evaluation on a comprehensive dataset demonstrated that GCN-Mamba significantly outperforms classical machine learning models in predictive accuracy. In a targeted case study against Methicillin-resistant Staphylococcus aureus (MRSA), the model successfully rediscovered known synergistic pairs, such as Quercetin and Curcumin, consistent with recent literature. Furthermore, prospective in vitro validation confirmed a novel synergistic combination of Shikimic acid and Oxacillin, validating the model's practical utility. By efficiently prioritizing potential candidates, GCN-Mamba serves as a powerful and reliable tool for accelerating the discovery of synergistic antimicrobial combinations, effectively bridging the gap between computational prediction and experimental validation.
Teshirogi, Y.; Terada, T.
Molecular dynamics (MD) simulations are a powerful tool for investigating biomolecular dynamics underlying biological functions. However, the accessible spatiotemporal scales of conventional all-atom simulations remain limited by high computational costs. Coarse-graining reduces these costs by decreasing the number of interaction sites and enabling longer timesteps. In extreme cases, proteins are represented as single spherical particles; while such approximations facilitate cellular-scale simulations, they often sacrifice essential structural information, such as molecular shape and interaction anisotropy. Here, we present CGRig, a rigid-body protein model with residue-level interaction sites designed for long-time, large-scale simulations. In CGRig, each protein is treated as a single rigid body embedding residue-level interaction sites. Its translational and rotational motions are described by the overdamped Langevin equation incorporating a shape-dependent friction matrix. Intermolecular interactions are calculated using Gō-like native contact potentials, Debye-Hückel electrostatics, and volume exclusion. We validated that CGRig accurately reproduces the translational and rotational diffusion coefficients expected from the friction matrix for an isolated protein. For dimeric systems, the model successfully maintained native complex structures. Furthermore, two initially separated proteins converged into the correct complex with an association rate consistent with all-atom simulations. Notably, CGRig achieved a simulation performance exceeding 17 s/day for a 1,024-molecule system. These results demonstrate that CGRig provides an efficient framework for simulating protein assembly while retaining residue-level interaction specificity, making it a valuable tool for investigating large-scale biomolecular self-assembly.
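Two of the ingredients named in this abstract can be sketched in a few lines. The code below is an illustration under simplifying assumptions and is not the CGRig implementation: it uses a single scalar diffusion coefficient in place of the shape-dependent friction matrix, handles translation only (no rotation), and the unit system (kJ/mol, nm, elementary charges), the default relative permittivity of 78, and all function names are choices made for this sketch.

```python
import math
import random

# Coulomb prefactor e^2 / (4*pi*eps0) in kJ mol^-1 nm e^-2 (unit choice is an assumption).
COULOMB_KJ_NM = 138.935458


def debye_huckel_energy(q1, q2, r_nm, debye_length_nm, rel_permittivity=78.0):
    """Screened Coulomb energy between two point charges at distance r_nm."""
    coulomb = COULOMB_KJ_NM * q1 * q2 / (rel_permittivity * r_nm)
    return coulomb * math.exp(-r_nm / debye_length_nm)


def brownian_step(pos, force, diff_coeff, kT, dt, rng=random):
    """One Euler-Maruyama step of the overdamped Langevin equation.

    Isotropic simplification: x_new = x + (D / kT) * F * dt + sqrt(2 * D * dt) * xi,
    with xi drawn from a standard normal distribution per coordinate.
    """
    sigma = math.sqrt(2.0 * diff_coeff * dt)
    mobility = diff_coeff / kT
    return tuple(
        x + mobility * f * dt + rng.gauss(0.0, sigma)
        for x, f in zip(pos, force)
    )
```

For example, `brownian_step((0.0, 0.0, 0.0), (0.0, 0.0, 0.0), diff_coeff=0.1, kT=2.5, dt=0.01)` performs one force-free diffusive move. Any object with a `gauss(mu, sigma)` method can be passed as `rng`, which makes seeded or noise-free runs easy to set up when checking the drift term.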
Chattaraj, A.; Kanovich, D. S.; Ranganathan, S.; Shakhnovich, E. I.
Phase-separated condensates are recognized as a ubiquitous mechanism of spatial organization in cell biology. Biophysical modeling of condensates provides critical insights into the dynamics and functions of these subcellular structures that are difficult to extract via experiments. Here we present an efficient computational pipeline, CASPULE (Condensate Analysis of Sticker Spacer Polymers Using the LAMMPS Engine), to simulate and analyze biological condensates made of sticker-spacer polymers. CASPULE implements a unique force field that combines traditional Langevin dynamics with a "detailed balance proof" protocol for single-valent bond formation between stickers. This framework allows us to study the non-trivial biophysics that emerges from single-valent sticker interactions coupled with the separation of energetic contributions between stickers and spacers. We provide detailed documentation on how to set up the simulation environment, perform simulations and analyze the results. Through case studies, we highlight the utility and efficacy of our pipeline. Importantly, we provide statistical parameters to characterize the cluster size distribution often observed in biological systems. We envision this tool to be broadly useful in decoding the interplay of kinetics and thermodynamics underlying the formation and function of biological condensates.
Ben-Joseph, J.
Lightweight epidemic calculators are widely used for teaching and rapid scenario exploration, yet many omit the methodological detail needed for scientific reuse. We present a browser-native SIR calculator that exposes forward Euler and classical fourth-order Runge-Kutta (RK4) integration alongside epidemiologically interpretable outputs and a population-conservation diagnostic. The implementation is anchored to analytical properties of the deterministic SIR system, including the epidemic threshold, the peak condition, and the final-size relation. Benchmark experiments show that RK4 is essentially step-size invariant over practical discretizations, whereas Euler at a coarse one-day step overestimates peak prevalence by 3.97% and final size by 0.66% relative to a fine-step RK4 reference. These results demonstrate that browser-based tools can support publication-quality computational narratives when solver choice, diagnostics, and assumptions are treated as first-class outputs.
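The solver comparison described here is easy to reproduce in outline. The following is a minimal sketch in Python, not the calculator's own browser code: it integrates the SIR model in population fractions with forward Euler and RK4, tracks a conservation diagnostic (how far S + I + R drifts from 1), and checks the result against two analytical properties, the peak prevalence and the first integral behind the final-size relation. The parameter values (beta = 0.5, gamma = 0.2, initial infected fraction 0.001) and all function names are assumptions chosen for illustration, so the percentages quoted in the abstract should not be expected from this setup.

```python
import math


def sir_rhs(s, i, r, beta, gamma):
    """Right-hand side of the deterministic SIR model (population fractions)."""
    new_infections = beta * s * i
    recoveries = gamma * i
    return -new_infections, new_infections - recoveries, recoveries


def euler_step(state, dt, beta, gamma):
    """Forward Euler update of (s, i, r)."""
    k = sir_rhs(*state, beta, gamma)
    return tuple(x + dt * dx for x, dx in zip(state, k))


def rk4_step(state, dt, beta, gamma):
    """Classical fourth-order Runge-Kutta update of (s, i, r)."""
    def shifted(k, h):
        return tuple(x + h * dx for x, dx in zip(state, k))

    k1 = sir_rhs(*state, beta, gamma)
    k2 = sir_rhs(*shifted(k1, dt / 2), beta, gamma)
    k3 = sir_rhs(*shifted(k2, dt / 2), beta, gamma)
    k4 = sir_rhs(*shifted(k3, dt), beta, gamma)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def analytic_peak(beta, gamma, i0=1e-3):
    """Peak prevalence from the SIR first integral (valid when R0 * s0 > 1, r0 = 0)."""
    r0_number = beta / gamma
    s0 = 1.0 - i0
    return 1.0 - (1.0 + math.log(r0_number * s0)) / r0_number


def simulate(step, dt, days, beta, gamma, i0=1e-3):
    """Run one solver and report peak, final recovered fraction, and diagnostics."""
    s0 = 1.0 - i0
    state = (s0, i0, 0.0)
    peak, max_drift = i0, 0.0
    for _ in range(round(days / dt)):
        state = step(state, dt, beta, gamma)
        peak = max(peak, state[1])
        max_drift = max(max_drift, abs(sum(state) - 1.0))
    s, _, r = state
    # Exact trajectories satisfy ln(s / s0) + R0 * r = 0 at all times.
    residual = math.log(s / s0) + (beta / gamma) * r
    return {"peak": peak, "recovered": r, "max_drift": max_drift,
            "invariant_residual": residual}


if __name__ == "__main__":
    for name, step, dt in [("euler", euler_step, 1.0), ("rk4", rk4_step, 0.1)]:
        print(name, dt, simulate(step, dt, 300, 0.5, 0.2))
```

Reporting `max_drift` and `invariant_residual` next to the epidemiological outputs is the point of the exercise: both are cheap to compute and make a poorly chosen solver or step size visible instead of silent.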
Zhang, H.; Zheng, G.; Xu, Z.; Zhao, H.; Cai, S.; Huang, Y.; Zhou, Z.; Wei, Y.
Missense variants are a common type of genetic mutation that can alter the structure and function of proteins, thereby affecting the normal physiological processes of organisms. Accurately distinguishing damaging missense variants from benign ones is of great significance for clinical genetic diagnosis, treatment strategy development, and protein engineering. Here, we propose the VarDCL method, which integrates multimodal protein language model embeddings and self-distilled contrastive learning to identify subtle sequence and structural differences before and after protein mutations, thereby accurately predicting pathogenic missense variants. First, leveraging sequence and structural information before and after mutations, VarDCL generates sequence-structural multimodal features via different language models. It incorporates both global and local perspectives of feature embeddings to provide the model with dynamic, multimodal, and multi-view input data. Additionally, a Self-distilled Contrastive Learning (SDCL) module is proposed to enable more effective information integration and feature learning, enhancing the model's ability to detect sequence and structural changes induced by mutations. Within this module, the multi-level contrastive learning framework captures information differences before and after mutations within the same modality; meanwhile, the feature self-distillation mechanism uses high-level fused features to guide the learning of low-level differential features, facilitating information interaction across different modalities. The VarDCL framework not only ensures the model's capacity to learn dynamic changes pre- and post-mutation but also significantly improves cross-modal information interaction between sequence and structure, thereby boosting the model's performance in distinguishing pathogenic mutations from benign ones.
To validate the effectiveness of VarDCL, extensive experiments were conducted. The ablation study demonstrates that all key components of VarDCL contribute significantly. On an independent test set containing 18,731 clinical variants, VarDCL achieved an AUC of 0.917, an AUPR of 0.876, an MCC of 0.690, and an F1-score of 0.789, outperforming 21 state-of-the-art existing methods. Benchmark analysis shows that VarDCL can be utilized as an accurate and potent tool for predicting missense variant effects.
Yi, J.; Liu, J.; Guo, P.; Ye, Y.-n.; Zhou, X.
Rapid advances in single-cell RNA sequencing (scRNA-seq) technology have enabled the investigation of gene expression changes at the single-cell level, particularly for elucidating the heterogeneity among cells and complex biological processes. This technique reveals subtle molecular differences within individual cells, thereby offering a unique viewpoint for the investigation of cell cycle progression, cellular differentiation, and disease pathogenesis. However, accurately identifying and analyzing cell cycle dynamics in scRNA-seq data remains challenging due to the complexity of the data and the subtle differences between cell states. To address this challenge, we developed the integrated Sinusoidal and Piecewise AutoEncoder (SPAE), an autoencoder-based piecewise linear model, for characterizing the cell cycle dynamics and cell states in scRNA-seq data. Compared with existing methods, SPAE demonstrates substantially improved accuracy and robustness in cell cycle characterization. Additionally, SPAE can accurately predict cancer cell cycle transitions and effectively facilitate the removal of cell cycle effects from gene expression data. SPAE is available for non-commercial use at https://github.com/YaJahn/SPAE.
Kawasaki, R.; Takemoto, K.; Hamano, M.
Direct reprogramming (DR) converts somatic cells directly into target cell types while bypassing an intermediate pluripotent state, such as induced pluripotent stem cells. In practice, DR is achieved by transfecting multiple transcription factors (TFs); prior research has shown that combining microRNAs (miRNAs) with TFs further improves reprogramming efficiency. However, experimentally identifying effective TFs and miRNA combinations is difficult and costly, underscoring the need for robust in silico prediction approaches. We developed a graph neural network-based method to predict TFs that induce DR across diverse human cell types while explicitly modeling miRNA-mediated transcriptional regulation. By constructing a gene regulatory network integrating TF-target gene, TF-miRNA, miRNA-target gene, and gene-gene interactions, we implemented a Graph Attention Network v2 that predicts DR-inducing TFs while learning interaction importance and capturing transcriptional activation and repression. This approach outperformed existing methods in predicting experimentally validated DR-inducing TFs. Moreover, high-ranking predictions for previously unexplored tissues included TFs known to be associated with the development of the corresponding tissues, supporting the biological relevance of the results. Overall, the proposed method provides a practical in regenerative medicine.
Zhang, X.; Fang, Z.; Tang, K.; Chen, H.; Li, J.
Targeted drug therapies offer a promising approach for treating complex diseases, with combinational drug therapies often employed to enhance therapeutic efficacy. However, unintended drug-drug interactions (DDIs) may undermine treatment outcomes or cause adverse side effects. In this work, we propose a novel joint learning framework for the simultaneous prediction of effective drug combinations and drug-drug interactions, based on coupled tensor-tensor factorization. Specifically, we model drug combination therapies and DDIs by representing drug-drug-disease associations and drug-drug interaction profiles as coupled three-way tensors. To address the challenges of data incompleteness and sparsity, the proposed model integrates auxiliary drug similarity information, such as chemical structure similarities, drug-specific side effects, drug target profiles, and drug inhibition data on cancer cell lines, within a multi-view learning framework. For optimization, we adopt a modified Alternating Direction Method of Multipliers (ADMM) algorithm that ensures convergence while enforcing non-negativity constraints. In addition to standard tensor completion tasks, we further evaluate the proposed method under a more realistic new-drug prediction setting, where all interactions involving a previously unseen drug are withheld. This scenario closely aligns with real-world applications, in which reliable predictions for emerging or under-studied compounds are essential. We evaluate the proposed method on a comprehensive dataset compiled from multiple sources, including DrugBank, CDCDB, SIDER, and PubChem. Our experiments show that SI-ADMM maintains robust performance and achieves the best results compared to other tensor factorization approaches, with or without auxiliary information, particularly in the new-drug prediction setting. The implementation of our method is publicly available at: https://github.com/Xiaoge-Zhang/SI-ADMM.
Soboleva, A.; Honasoge, K. S.; Molnarova, E.; Dingemans, A.-M.; Grossmann, I.; Rezaei, J.; Stankova, K.
Evolutionary cancer therapy (ECT) applies principles of evolutionary game theory to prolong the effectiveness of cancer treatment by curbing the development of treatment resistance. It has been shown to increase time to progression while decreasing the cumulative drug dose. ECT individually tailors treatment schedules for patients based on their cancer dynamics and, thus, requires regular follow-up and precise measurements of the cancer burden. The current literature on ECT often overlooks clinical realities, such as rather long intervals between tests, possible appointment delays and measurement errors, in the development of the treatment protocols. In this study, we assess the clinical feasibility of ECT for metastatic non-small cell lung cancer (NSCLC). We create virtual patients with cancer dynamics described by the polymorphic Gompertzian model, based on data from the START-TKI clinical trial. We assess the effects of longer test intervals, measurement error and appointment delays on the expected time to progression under the evolutionary therapy protocols. We show that a higher containment level, although it increases time to progression in the model's predictions, may lead to premature treatment failure in the presence of measurement error and appointment delay. Further, we show that the ECT protocol with a single containment bound is more robust to the clinical realities than the protocol with two bounds. Finally, we show that a dynamically adjusted treatment protocol can be beneficial for individual patients, but requires a thorough follow-up. This study contributes to the design of a clinical trial and the future clinical implementation of evolutionary therapy for NSCLC.